
Genetics Selection Evolution

Springer Science and Business Media LLC

All preprints, ranked by how well they match Genetics Selection Evolution's content profile, based on 33 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.

1
Selection History Models in a Population under Ongoing Directional Selection

Jansen, A. C. M.; Calus, M. P. L.; Wientjes, Y. C. J.

2025-11-12 genetics 10.1101/2025.11.11.687909 medRxiv
Top 0.1%
69.0%

The aim of animal breeding is to select the genetically best animals in the current generation to improve the performance of future generations for a specific breeding goal. With the continuous shift in breeding goals towards more balanced breeding, new traits may become of interest. Knowledge of the (indirect) selection history of these traits would be insightful before a trait is included in the breeding goal. Two models, BayesS and Ĝ, have been developed to assess the selection history of traits. BayesS estimates a parameter (s) that reflects the relationship between estimated additive effects and minor allele frequency, while Ĝ calculates the expected genetic change of a trait based on allele frequency changes and estimated additive marker effects. The aim of this study was to evaluate the performance of estimating s-values (based on BayesS) and Ĝ in an animal breeding context, focusing on their ability to detect selection for a trait with low heritability. Both Ĝ and s-value estimation were applied to a simulated dataset of a commercial pig breeding program under phenotypic selection, with varying heritabilities (0.05, 0.1, 0.3) and 30 generations of ongoing selection. Overall, both models were able to detect selection, with higher heritabilities and a larger sample size (for s-value estimation) or a larger selection interval (for Ĝ) resulting in increased detection of selection. The preferred model to identify selection varied based on the available data of the breeding population.

2
BayesR3AD: Joint analysis of additive and dominance in Bayesian mixture models

Yuan, H.; Breen, E. J.; MacLeod, I. M.; Khansefid, M.; Xiang, R.; Goddard, M. E.

2026-02-25 genetics 10.64898/2026.02.23.707560 medRxiv
Top 0.1%
60.5%

Background: Genomic prediction in livestock is predominantly based on additive models, even though dominance and other non-additive effects can contribute appreciably to phenotypic variance for fitness and fertility traits. Bayesian mixture models, such as BayesR, have proven effective for modelling sparse, heterogeneous additive SNP effects, but most implementations do not explicitly accommodate dominance. In this study, we extended BayesR3 to jointly model additive and dominance marker effects within a unified Bayesian mixture framework, denoted BayesR3AD, and used this method to estimate additive and dominance effects for fertility and cow survival (longevity) in Holstein cattle. Results: Using real Holstein genotypes (227,942 animals, 74,626 SNPs), we simulated phenotypes with additive and dominance effects. When dominance was present in the simulated data, BayesR3AD improved prediction accuracy of genetic values by +0.1011 (0.6144 vs 0.5133; ≈19.7% relative) compared with the additive-only BayesR3 model and recovered additive and dominance variance components without bias. Under purely additive simulations, dominance mixture components were effectively empty, confirming that the extended model shrinks unnecessary dominance effects toward zero. In real fertility data, including calving interval (63,378 records) and survival (68,514 records), BayesR3AD estimated small dominance variance (≈1-3% of total genetic variance). The model highlighted a very large additive locus at 57.82 Mb on BTA18 for both calving interval and survival, concordant with previous GWAS studies of Holstein fertility. Additionally, a large dominance effect was found at 44.37 Mb on BTA18 for calving interval, implicating a heterozygote advantage that increases fertility. Conclusions: BayesR3AD provides a practical extension of BayesR3 that captures both additive and dominance contributions to genomic prediction. The method is robust, reverting effectively to the additive model when dominance is absent, while delivering accurate variance decomposition and potential gains in prediction accuracy when dominance is present. Application to Holstein fertility traits demonstrates that dominance can be detected and quantified without compromising additive inference, supporting improved prediction of total genetic merit. While validated in cattle, BayesR3AD can be directly applied to other species to better model and predict traits related to fitness.

3
Impact of genomic selection on genetic diversity in five local European cattle breeds

Bonifazi, R.; Meuwissen, T. H.; Croiseau, P.; Restoux, G.; Minery, S.; Windig, J.

2025-09-08 genetics 10.1101/2025.09.08.674844 medRxiv
Top 0.1%
51.9%

Genomic selection (GS) has revolutionised animal breeding and accelerated genetic gains in breeding programs. While GS has become common in major dairy cattle breeds, its implementation in local breeds has begun only more recently or is still in progress. However, the introduction of GS in some major breeds has also been associated with increased inbreeding rates, raising concerns about the potential effects of GS on the genetic diversity in smaller or local breeds. Our aim was to investigate the impact of GS on genetic diversity in five (small) local cattle breeds from three European countries. The five breeds evaluated were: MRY (from the Netherlands), Norwegian Red (from Norway), Abondance, Tarentaise, and Vosgienne (from France). We investigated changes in population demographic structure, as well as trends and rates of kinship and inbreeding, using both pedigree- and genomic-based measures. The population size varied depending on the breed, with Vosgienne being the smallest and Norwegian Red being the largest. The dataset included 4,645 MRY, 193,489 Norwegian Red, 16,427 Abondance, 8,882 Tarentaise, and 4,466 Vosgienne genotyped animals for more than 40,000 single-nucleotide polymorphisms. Overall, following the implementation of GS in these breeds, we observed a reduction in generation intervals for sires, fewer calves that later became sires, and, for the French breeds, a broader sire usage. Such changes were likely due to GS enabling the preselection and screening of more young bulls. Additionally, we observed a more balanced contribution of the top ten sires after the introduction of GS. Although changes in inbreeding and kinship rates occurred after the introduction of GS, there was no consistent pattern across breeds: rates increased in MRY and Tarentaise, but decreased in Norwegian Red, Abondance, and Vosgienne. 
Our study suggests that changes and increases in inbreeding rates may occur after the introduction of GS, although they may not be directly due to the introduction of GS per se, but rather due to population management strategies, such as optimal contribution selection or other breeding practices implemented at the nucleus level. Our findings emphasise the importance of monitoring changes in both genetic diversity and population demographic structure after implementing GS in local breeds, as well as adjusting breeding strategies when needed to ensure long-term sustainability. Interpretive Summary: Genomic selection (GS) has transformed cattle breeding. We investigated changes in population demographic structure and genetic diversity in five European breeds after the introduction of GS. Changes in inbreeding and kinship rates were not consistent across breeds, with both increases and decreases observed. Genetic management strategies, such as optimal contribution selection, had a greater impact on maintaining genetic diversity than the introduction of GS per se. These findings highlight the need to monitor changes in genetic diversity and population demographic structure after the implementation of GS and, when needed, to adapt management strategies to ensure long-term sustainability.

4
Trajectories of genetic correlations in populations under selection: from theory to a case-study

Cuyabano, B. C.; Motta, M. R.; Vandenplas, J.; Garcia, N. L.; Shokor, F.; Croiseau, P.; Boichard, D.; Aguerre, S.; Mattalia, S.

2025-03-15 genetics 10.1101/2025.03.13.643026 medRxiv
Top 0.1%
43.3%

Background: Breeding programs select for multiple commercial traits, aiming to achieve genetic progress for all. Often, selection is based on a selection index, i.e. a linear combination of traits with weights defined by, among other information, the genetic correlation between traits. These correlations are typically estimated as a static parameter, and assumed equal for all individuals and generations. While research on the consequences of selection for genetic variances (Bulmer effect) is widely available, only a few studies have focused on the consequences of selection for genetic correlations. Our study extended the existing inferences about how selection affects genetic variances to how multi-trait selection affects genetic correlations. In order to further our understanding of genetic correlations, we also proposed an alternative method to calculate genetic correlations between traits at the individual level, which we call the individualized sire genetic correlation (iSGC), obtained through the estimated breeding values (EBV) from evaluated daughters. Lastly, a case-study was performed on thirty years of data from the French Holstein dairy cattle population, for five traits studied pairwise: milk and protein yield, milking speed, somatic cell score, and cow conception rate. Results: Theory revealed that multi-trait selection leads to an attenuation (decrease) of positive genetic correlations, with potential to revert them to negative values, if initially low. Uncorrelated traits will become negatively correlated, and negative genetic correlations will be either intensified or attenuated (decrease or increase, respectively), depending on selection intensity, weights applied to the selection index, and the initial genetic correlation. Conclusion: Both theory and empirical results on real data confirm that selection does change the genetic correlation between traits in a population under selection. Moreover, empirical trajectories of the iSGC were in better agreement with the theory than trajectories of populational genetic correlations. The iSGC searches for individual-specific patterns of correlations, and since it is measured on sires through the EBV of their daughters, it also considers the recombination of the genetic background. Along with the fact that trajectories of iSGC were in better agreement with theory, we believe it to be a potentially less biased measure of genetic correlations between traits.

5
Impact of genomic preselection on subsequent genetic evaluations with ssGBLUP - using real data from pigs

Jibrila, I.; Vandenplas, J.; ten Napel, J.; Bergsma, R.; Veerkamp, R. F.; Calus, M. P. L.

2021-06-19 genetics 10.1101/2021.06.18.449002 medRxiv
Top 0.1%
41.9%

Background: Empirically assessing the impact of preselection on subsequent genetic evaluations of preselected animals requires comparison of scenarios taking into account different approaches, including scenarios without preselection. However, preselection almost always takes place in animal breeding programs, so it is difficult to have a dataset without preselection. Hence most studies on preselection used simulated datasets, concluding that genomic estimated breeding values (GEBV) from subsequent single-step genomic best linear unbiased prediction (ssGBLUP) evaluations are unbiased. The aim of this study was to investigate the impact of genomic preselection (GPS) on accuracy and bias in subsequent ssGBLUP evaluations, using data from a commercial pig breeding program. Methods: We used data on four pig production traits from one sire line and one dam line. The traits are average daily gain during performance testing, average daily gain throughout life, backfat thickness, and loin depth. As these traits had different weights in the breeding goals of the two lines, we analyzed the two lines separately. Per line, we had a reference GPS scenario which kept all available data, against which the next two scenarios were compared. We then implemented two other scenarios with additional layers of GPS by removing all animals without progeny either i) only in the validation generation, or ii) in all generations. We conducted subsequent ssGBLUP evaluations per GPS scenario, utilizing all the data remaining after implementing the GPS scenario. In computing accuracy and bias, we compared GEBV against progeny yield deviations of validation animals. Results: Results for all traits in both lines showed marginal loss in accuracy due to the additional layers of GPS. Average accuracy across all GPS scenarios in both lines was 0.39, 0.47, 0.56, and 0.60 respectively for the four traits considered in this study. Bias was largely absent, and when present did not differ greatly among corresponding GPS scenarios. Conclusion: As preselection generally has the same effect in animal breeding programs, we concluded that impact of preselection is generally minimal on accuracy and bias in subsequent ssGBLUP evaluations of selection candidates in pigs and in other animal breeding programs.

6
Accounting for nuclear and mito genome in dairy cattle breeding - a simulation study

Mafra Fortuna, G.; Zumbach, B. J.; Johnsson, M.; Pocrnic, I.; Gorjanc, G.

2023-11-21 genetics 10.1101/2023.11.20.567907 medRxiv
Top 0.1%
41.7%

Mitochondria play a significant role in numerous cellular processes through proteins encoded by both the nuclear genome (nDNA) and the mito genome (mDNA). While the variation in nDNA is influenced by mutations and recombination of parental genomes, the variation in mDNA is solely driven by mutations. In addition, mDNA is inherited in a haploid form, from the dam. Cattle populations show significant variation in mDNA between and within breeds. Past research suggests that variation in mDNA accounts for 1-5% of the phenotypic variation in dairy traits. Here we simulated a dairy cattle breeding program to assess the impact of accounting for mDNA variation in pedigree-based and genome-based genetic evaluations on the accuracy of estimated breeding values for mDNA and nDNA components. We also examined the impact of alternative definitions of breeding values on genetic gain, including nDNA and mDNA components that both impact phenotype expression, although mDNA is inherited only maternally. We found that accounting for mDNA variation increased accuracy between +0.01 and +0.05 for different categories of animals, especially for young bulls (+0.05) and females without genotype data (between +0.01 and +0.03). Different scenarios of modelling and breeding value definition impacted genetic gain. The standard approach of ignoring mDNA variation achieved competitive genetic gain. Modelling, but not selecting on, mDNA expectedly reduced genetic gain, while optimal use of mDNA variation recovered the genetic gain.

7
Predicting nonlinear genetic relationships between traits in multi-trait evaluations by using a GBLUP-assisted Deep Learning model

Shokor, F.; Croiseau, P.; Gangloff, H.; Saintilan, R.; Tribout, T.; Mary-Huard, T.; Cuyabano, B. C. D.

2024-03-27 genomics 10.1101/2024.03.23.585208 medRxiv
Top 0.1%
40.3%

Background: Genomic prediction aims to predict the breeding values of multiple complex traits, usually assumed to be normally distributed by the widely used statistical methods, thus imposing linear genetic correlations between traits. While statistical methods are of great value for genomic prediction, these methods do not account for nonlinear genetic relationships between traits. If such relationships exist, although statistical models do perform a fair linear approximation, their prediction accuracy is limited due to the nonlinearity. Deep learning (DL) is a promising methodology for predicting multiple complex traits, in scenarios where nonlinear genetic relationships are present, due to its capacity to capture complex and nonlinear patterns in large data. We proposed a novel hybrid DLGBLUP model which uses the output of the traditional GBLUP, and enhances its PGV by accounting for nonlinear genetic relationships between traits using DL. Using simulated data, we compared the accuracy of the PGV obtained with the proposed hybrid DLGBLUP model, a DL model, and the traditional GBLUP model, the latter being our baseline reference. Results: We found that both DL and DLGBLUP models either outperformed GBLUP, or presented equally accurate PGV, with particularly greater accuracy for traits presenting a strongly characterized nonlinear genetic relationship. Overall, DLGBLUP presented the highest prediction accuracy, up to 0.2 points higher than GBLUP, and smallest mean squared error of the PGV for all traits. Additionally, we evolved a base population over seven generations and compared the genetic progress when selecting individuals based on the additive PGV obtained by either DL, DLGBLUP or GBLUP. For all traits with a nonlinear genetic relationship, after the fourth generation, the observed genetic gain when selection was based on the additive PGV from GBLUP was always inferior to the one achieved from either DL or DLGBLUP. Conclusions: The integration of DL into genomic prediction enables the possibility of modeling nonlinear relationships between traits. Moreover, by identifying these nonlinear genetic relationships, our DL and DLGBLUP models improved prediction accuracy, when compared to GBLUP. The possibility of nonlinear relationships between traits offers a different perspective into multi-trait evaluations and prediction, as well as into the traits' evolution over generations, with potential to further improve selection strategies in commercial livestock breeding programs. Moreover, DLGBLUP shows that DL can be used as a complement to statistical methods, by enhancing their performance.

8
Population history of Swedish cattle breeds: estimates and model checking

Adepoju, D.; Ohlsson, J. I.; Klingström, T.; Rius-Vilarrasa, E.; Johansson, A. M.; Johnsson, M.

2024-10-04 genetics 10.1101/2024.10.03.616479 medRxiv
Top 0.1%
40.1%

In this work, we use linkage disequilibrium-based methods to estimate recent population history from genotype data in Swedish cattle breeds, as well as international Holstein and Jersey cattle data for comparison. Our results suggest that these breeds have been effectively large up until recently, when they declined around the onset of systematic breeding. The inferred trajectories were qualitatively similar, with a large historical population and one decline. We used population genetic simulation to check the inferences. When comparing simulations from the inferred population histories to real data, the proportion of low-frequency variants in real data differed from that implied by the inferred population histories, and there was somewhat higher genomic inbreeding in real data than implied by the inferred histories. The inferred population histories imply that much of the variation we see today is transient, and it will be lost as the populations settle into a new equilibrium, even if efforts to maintain effective population size at current levels are successful.

9
The benefits and perils of import in small cattle breeding programs

Obsteter, J.; Jenko, J.; Pocrnic, I.; Gorjanc, G.

2022-12-12 genetics 10.1101/2022.12.09.519737 medRxiv
Top 0.1%
38.7%

Small breeding programs are limited in achieving competitive genetic gain and prone to high rates of inbreeding. Thus, they often import genetic material to increase genetic gain and to limit the loss of genetic variability. However, the benefit of import depends on the strength of genotype by environment interaction. It also diminishes the relevance of domestic selection and the use of domestic breeding animals. Introduction of genomic selection has potentially exacerbated this issue, but is also opening the potential for smaller breeding programs. The aim of this paper was to determine when and to what extent small breeding programs benefit from import. We simulated two cattle breeding programs differing in selection parameters representing a large foreign and a small domestic breeding program that differ in the initial genetic mean and annual genetic gain. We evaluated a control scenario without the use of foreign sires in the domestic breeding program and 20 scenarios that varied the percentage of domestic dams mated with foreign sires, the genetic correlation between the breeding programs (0.8 or 0.9), and the time of implementing genomic selection in the domestic compared to the foreign breeding program (concurrently or with a 10-year delay). We compared the scenarios based on the genetic gain and genic standard deviation. Finally, we partitioned breeding values and genetic trends of the scenarios to quantify the contribution of domestic selection and import to the domestic genetic gain. The simulation revealed that when both breeding programs implemented genomic selection simultaneously, the use of foreign sires increased domestic genetic gain only when genetic correlation was 0.9. In contrast, when the domestic breeding program implemented genomic selection with a 10-year delay, genetic correlation of 0.8 sufficed for a positive impact of import. In that scenario, domestic genetic gain increased with the increasing use of foreign sires but with a diminishing return. The partitioning analysis revealed that the contribution of import expectedly increased with the increased use of foreign sires. However, the increase did not depend on the genetic correlation and was not proportional to the increase in domestic genetic gain. This means that a small breeding program could be overly relying on import with diminishing returns for the genetic gain and marginal benefit for the genetic variability. The benefit of import depends on an interplay of genetic correlation, extent of using foreign sires, and a breeding scheme. It is therefore crucial that small breeding programs assess the possible benefits of import beyond domestic selection. The benefit of import should be weighed against the perils of decreased use of domestic sires and decreased contribution and value of domestic selection.

10
Comparison of two multi-trait association testing methods and sequence-based fine mapping of six QTL in Swiss Large White pigs

Noskova, A.; Mehrotra, A.; Kadri, N. K.; Lloret-Villas, A.; Neuenschwander, S.; Hofer, A.; Pausch, H.

2022-12-15 genomics 10.1101/2022.12.13.520268 medRxiv
Top 0.1%
37.7%

Background: Genetic correlations between complex traits suggest that pleiotropic variants contribute to trait variation. Genome-wide association studies (GWAS) aim to uncover the genetic underpinnings of traits. Multivariate association testing and the meta-analysis of summary statistics from single-trait GWAS enable detecting variants associated with multiple phenotypes. In this study, we used array-derived genotypes and phenotypes for 24 reproduction, production, and conformation traits to explore differences between the two methods and used imputed sequence variant genotypes to fine-map six quantitative trait loci (QTL). Results: We considered genotypes at 44,733 SNPs for 5,753 pigs from the Swiss Large White breed that had deregressed breeding values for 24 traits. Single-trait association analyses revealed eleven QTL that affected 15 traits. Multi-trait association testing and the meta-analysis of the single-trait GWAS revealed between 3 and 6 QTL, respectively, in three groups of traits. The multi-trait methods revealed three loci that were not detected in the single-trait GWAS. Four QTL that were identified in the single-trait GWAS remained undetected in the multi-trait analyses. To pinpoint candidate causal variants for the QTL, we imputed the array-derived genotypes to the sequence level using a sequenced reference panel consisting of 421 pigs. This approach provided genotypes at 16 million imputed sequence variants with a mean accuracy of imputation of 0.94. The fine-mapping of six QTL with imputed sequence variant genotypes revealed four previously proposed causal mutations among the top variants. Conclusions: Our findings in a medium-size cohort of pigs suggest that multivariate association testing and the meta-analysis of summary statistics from single-trait GWAS provide very similar results. Although multi-trait association methods provide a useful overview of pleiotropic loci segregating in mapping populations, the investigation of single-trait association studies is still advised, as multi-trait methods may miss QTL that are uncovered in single-trait GWAS.

11
Evaluating genotyping strategies for a small managed population with simulation

Martin, A. A. A.; Schoenebeck, J.; Clements, D. N.; Lewis, T.; Wiener, P.; Gorjanc, G.

2025-01-25 genomics 10.1101/2025.01.23.634495 medRxiv
Top 0.1%
37.2%

Background: Collecting genomic information is crucial to advance breeding for complex traits such as health, welfare, and behaviour in domesticated populations. For that purpose, different data collection scenarios can be envisioned based on the number of individuals, the number of markers, and the genotyping technology. This study developed a simulation framework, based on a service dog population, aiming to identify an optimal and cost-effective genotyping strategy that would support the implementation of genomic selection, investigation of the genetic architecture of traits of interest, and tracking of loci of interest. Methods: We simulated a population based on the existing pedigree, using the gene drop method in AlphaSimR. The existing pedigree was extended with additional progeny generations to evaluate the outcomes of different genotyping strategies in the future. We generated genotype data based on existing high-coverage whole-genome sequences (WGS) for the current breeding dogs and evaluated different scenarios for genotyping the progeny. The genotyping options considered SNP arrays of various densities and WGS callsets produced from different sequencing depths. We then phased and imputed the genotype data to high-coverage WGS using AlphaPeel. Results: All scenarios were compared based on individual imputation accuracy against the simulated true whole-genome genotype. Averaged over five generations of simulated progeny, low-pass sequencing (0.5 to 2X depth) achieved accuracies of 0.998 to 0.999. The accuracy of SNP array genotyping (25K to 710K markers) was lower, with means of 0.911 to 0.938. Conclusions: Our simulation was tailored to identify the most cost-effective and efficient strategy for downstream use in genomic selection and genetic research into traits and loci of interest. Low-pass sequencing outperformed SNP array genotyping in imputation accuracy of whole-genome genotypes as expected. Additionally, low-pass sequencing technology was the most affordable genotyping approach currently available for dogs. Thus, it appears to be the optimal choice for balancing the goals of regimented breeding programmes such as those that produce service dogs. This simulation framework could also be adapted to address other objectives for breeding organisations working with small populations.

12
A method for partitioning trends in genetic mean and variance to understand breeding practices

Oliveira, T. d. P.; Obsteter, J.; Pocrnic, I.; Gorjanc, G.

2022-01-12 genetics 10.1101/2022.01.10.475603 medRxiv
Top 0.1%
33.5%

Background: In breeding programmes, the observed genetic change is a sum of the contributions of different groups of individuals. Quantifying these sources of genetic change is essential for identifying the key breeding actions and optimizing breeding programmes. However, it is difficult to disentangle the contribution of individual groups due to the inherent complexity of breeding programmes. Here we extend the previously developed method for partitioning genetic mean by paths of selection to work both with the mean and variance of breeding values. Methods: We first extended the partitioning method to quantify the contribution of different groups to genetic variance assuming breeding values are known. Second, we combined the partitioning method with the Markov Chain Monte Carlo approach to draw samples from the posterior distribution of breeding values and use these samples for computing the point and interval estimates of partitions for the genetic mean and variance. We implemented the method in the R package AlphaPart. We demonstrated the method with a simulated cattle breeding programme. Results: We showed how to quantify the contribution of different groups of individuals to genetic mean and variance. We showed that the contributions of different selection paths to genetic variance are not necessarily independent. Finally, we observed some limitations of the partitioning method under a misspecified model, suggesting the need for a genomic partitioning method. Conclusion: We presented a partitioning method to quantify sources of change in genetic mean and variance in breeding programmes. The method can help breeders and researchers understand the dynamics in genetic mean and variance in a breeding programme. The developed method for partitioning genetic mean and variance is a powerful method for understanding how different paths of selection interact within a breeding programme and how they can be optimised.

13
Annotation and assessment of functional variants in regulatory regions using epigenomic data in farm animals

Ma, R.; Kuang, R.; Zhang, J.; Sun, J.; Xu, Y.; Zhou, X.; Han, Z.; Hu, M.; Wang, D.; Luan, Y.; Fu, Y.; Zhang, Y.; Li, X.; Zhu, M.; Xiang, T.; Zhao, S.; Shi, M.; Zhao, Y.

2024-11-07 genomics 10.1101/2024.02.06.578787 medRxiv
Top 0.1%
32.3%

Background: Understanding the functional impact of genetic variants is essential for advancing animal genomics and improving livestock breeding. Variants that disrupt transcription factor (TF) motifs provide a means to assess functional potential, but the lack of TF ChIP-seq data for farm animals presents a challenge. Results: To address this, we curated nearly 900 epigenomic datasets from 10 farm animal species and annotated eight regulatory regions to assess how variants affect TF motifs. Over 127 million candidate functional variants were classified into five functional confidence categories across the species. Variants with high confidence were enriched in eQTLs and trait-associated SNPs, showing greater potential to affect gene expression and phenotypes. Incorporating these functional variants into genomic prediction models improved the accuracy of Estimated Breeding Values (EBVs). Active variants also revealed trait-related tissues, and single-cell RNA sequencing (scRNA-seq) identified the cell types most associated with production traits. To facilitate research, we developed the Integrated Functional Mutation (IFmut) platform, enabling users to explore variant functions easily. Our study provides a flexible platform and resource for studying genomic variation in farm animals, setting a new standard for research and breeding strategies. Conclusion: The results indicated that evaluating functional potential by annotating and categorizing variants that interfere with transcription factor motifs can help elucidate changes in gene expression and phenotype. By focusing on high-confidence variants enriched in eQTL and trait-associated SNPs, it improves the accuracy of genomic predictions in research and breeding strategies.

14
Genome landscape and genetic architecture of recombination in domestic goats (Capra hircus)

Etourneau, A.; Rupp, R.; Servin, B.

2025-05-15 genetics 10.1101/2025.05.15.654186 medRxiv
Top 0.1%
29.5%

Background: Recombination is a fundamental biological process, both in contributing to the creation of viable gametes and as a driver of genetic diversity. Characterising recombination is therefore of strong interest in breeding populations. In this study, we used ~50K genotype data and pedigree from two French populations (Alpine and Saanen) of domestic goats (Capra hircus) to build sex-specific recombination maps, and to explore the genetic basis of two recombination phenotypes: genome-wide recombination rate (GRR) and intra-chromosomal shuffling. Results: Sex-specific recombination maps showed higher recombination in males than females for both breeds (Alpine autosomal map size = 35.1 M in males and 30.5 M in females; Saanen map size = 34.0 M in males and 29.0 M in females). Heterochiasmy is particularly notable on small chromosomes, and at both ends of the chromosomes. Yet, no difference in shuffling has been found between populations. Genetic parameters on recombination phenotypes could only be estimated in males, due to lack of data in females. Both phenotypes are significantly heritable (h² = 0.12 (0.03) for GRR and h² = 0.034 (0.015) for shuffling, when pooling breeds). GWAS on male GRR identified several significant loci, including RNF212, RNF212B and SSH1, the last one being a novel locus for this phenotype. Correlation of SNP effects between breeds is low for both recombination phenotypes (0.25 for GRR and 0.04 for shuffling), indicating different genetic determinants in the two breeds. Conclusions: This study contributes to the understanding of recombination evolution in ruminants, both between breeds and species.

15
Application of a French cattle pangenome, from structural variant discovery to association studies on key phenotypes

Sorin, V.; Naji, M.-M.; Birbes, C.; Grohs, C.; Escouflaire, C.; Fritz, S.; Eche, C.; Marcuzzo, C.; Suin, A.; Donnadieu, C.; Gaspin, C.; Iampietro, C.; Milan, D.; Drouilhet, L.; Tosser-Klopp, G.; Boichard, D.; Klopp, C.; Sanchez, M.-P.; Boussaha, M.

2025-04-18 genetics 10.1101/2025.04.15.648672 medRxiv
Top 0.1%
28.7%

Background: The current cattle reference genome assembly, a pseudo-linear sequence produced using sequences from a single Hereford cow, represents a limitation when performing genetic studies, especially when investigating the whole spectrum of genetic variations within the species. Detecting structural variations (SVs) poses significant challenges when relying solely on conventional methods of short or long-read sequence mapping to the current bovine genome assembly. Results: In this study, we used long reads (LR) and bioinformatic tools to construct a comprehensive bovine pangenome incorporating the genetic diversity of 64 good-quality de novo genome assemblies representing 14 French dairy and beef cattle breeds. Using a combination of complementary approaches, we explored the pangenome graph and identified 2.563 Gb of sequences common to all samples, and a cumulative 0.295 Gb of variable sequences. Notably, we discovered 0.159 Gb of novel sequences not present in the current Hereford reference genome assembly. Our analysis also revealed 109,275 SVs, of which 84,612 were bi-allelic, including 21,840 insertions and 21,340 deletions. Genome-wide association studies using SNPs and a panel of 221 SVs, shared between the pangenome and the EuroGMD chip, revealed several well-known QTLs across the genome for the Holstein, Montbeliarde and Normande breeds. Among those, a QTL on chromosome 11 presents an SV with a highly significant effect on stature in the Holstein breed. This SV is a 6.2 kb deletion affecting the 5′UTR, first exon and part of the first intron of the MATN3 gene, suggesting a potential regulatory and coding effect. Conclusions: Our study provides new insights into the genetic diversity of 14 French dairy and beef breeds and highlights the utility of pangenome graphs in capturing structural variation. The identified SV associated with stature highlights the importance of integrating SVs into GWAS for a more comprehensive understanding of complex traits.

16
Comparison of breeding strategies for the creation of a synthetic pig line

Ganteil, A.; Pook, T.; Rodriguez-Ramilo, S. T.; Ligonesche, B.; Larzul, C.

2021-09-24 genetics 10.1101/2021.09.22.461330 medRxiv
Top 0.1%
28.5%

Creating a new synthetic line by crossbreeding means complementary traits from pure breeds can be combined in the new population. Although diversity is generated during the crossbreeding stage, in this study, we analyze diversity management before selection starts. Using genomic and phenotypic data from animals belonging to the first generation (G0) of a new line, different simulations were run to evaluate diversity management during the first generations of a new line and to test the effects of starting selection at two alternative times, G3 and G4. Genetic diversity was characterized by allele frequency, inbreeding coefficients based on genomic and pedigree data, and expected heterozygosity. Breeding values were extracted at each generation to evaluate differences in starting selection at G3 or G4. All simulations were run for ten generations. A scenario with genomic data to manage diversity during the first generations of a new line was compared with a random and a selection scenario. As expected, loss of diversity was higher in the selection scenario, while the scenario with diversity control preserved diversity. We also combined the diversity management strategy with different selection scenarios involving different degrees of diversity control. Our simulation results show that a diversity management strategy combining genomic data with selection starting at G4 and a moderate degree of diversity control generates genetic progress and preserves diversity.

17
Impact of SNP calling quality on the detection of transmission ratio distortion in goats

Luigi-Sierra, M.; Casellas, J.; Martinez, A.; Delgado, J. V.; Alvarez, J. F.; Such, F. X.; Jordana, J.; Amills, M.

2021-06-10 genetics 10.1101/2021.06.09.447792 medRxiv
Top 0.1%
28.3%

Transmission ratio distortion (TRD) is the preferential transmission of one specific allele to offspring at the expense of the other one. The existence of TRD is mostly explained by the segregation of genetic variants with deleterious effects on the developmental processes that go from the formation of gametes to fecundation and birth. A few years ago, a statistical methodology was implemented in order to detect TRD signals on a genome-wide scale as a first step to uncover the biological basis of TRD and reproductive success in domestic species. In the current work, we have analyzed the impact of SNP calling quality on the detection of TRD signals in a population of Murciano-Granadina goats. Seventeen bucks and their offspring (N=288) were typed with the Goat SNP50 BeadChip, while the genotypes of the dams were lacking. Performance of a genome-wide scan revealed the existence of 36 SNPs showing significant evidence of TRD. When we calculated GenTrain scores for each one of the SNPs, we observed that 25 SNPs showed scores below 0.8. The allele frequencies of these SNPs in the offspring were not correlated with the allele frequencies estimated in the dams with statistical methods, thus evidencing that flawed SNP calling quality might lead to the detection of spurious TRD signals. We conclude that, when performing TRD scans, the GenTrain scores of markers should be taken into account to discriminate SNPs that are truly under TRD from those yielding spurious signals due to technical problems.

18
Disentangling the dynamics of energy allocation to provide a proxy of robustness in fattening pigs

Lenoir, G.; Flatres-Grall, L.; Munoz-Tamayo, R.; David, I.; Friggens, N. C.

2022-10-21 genetics 10.1101/2022.10.19.512827 medRxiv
Top 0.1%
28.0%

Background: There is a growing need to improve robustness characteristics in fattening pigs, but this trait is difficult to phenotype. Our first objective was to develop a robustness proxy on the basis of modelling of the longitudinal energetic allocation coefficient to growth for fattening pigs. Consequently, the environmental variance of this allocation coefficient was considered as a proxy of robustness. The second objective was to estimate its genetic parameters and correlation with traits under selection as well as with phenotypes routinely collected on farms. A total of 5848 pigs, from the Pietrain NN paternal line, were tested at the AXIOM boar testing station (Azay-sur-Indre, France) from 2015 to 2022. This farm was equipped with an automatic feeding system, recording individual weight and feed intake at each visit. We used a dynamic linear regression model to characterize the evolution of the allocation coefficient between cumulative net energy available, estimated from feed intake, and cumulative weight gain during the fattening period. Longitudinal energetic allocation coefficients were analysed using a two-step approach, to estimate both their genetic variance and the genetic variance in the residual variance, trait LSR. Results: The LSR trait, which could be interpreted as an indicator of the response of the animal to perturbations/stress, showed low heritability (0.05±0.01). The trait LSR had high favourable genetic correlations with average daily growth (-0.71±0.06) and unfavourable with feed conversion ratio (-0.76±0.06) and residual feed intake (-0.83±0.06). The analysis of the relationship between estimated breeding values (EBV) LSR quartiles and phenotypes routinely collected on farms shows the most favourable situation for animals from the quartile with the weakest EBV LSR, i.e., the most robust. Conclusions: These results show that selection for robustness based on deviation from the energetic allocation coefficient to growth can be considered in breeding programs for fattening pigs.

19
Genomic selection accuracy and bias using imputed genotypes on growth, welfare and fitness traits in two Pekin duck lines

Matika, O.; Tarsani, E. A.; McIntosh, K.; Desire, S. G.; Kebede, F. G.; Talenti, A. G.; Rae, A. M.; Kranis, A.; Watson, K. A.

2025-12-26 genomics 10.64898/2025.12.24.696349 medRxiv
Top 0.1%
27.9%

The current study investigated genomic selection accuracy and bias estimates from two commercial Pekin duck lines reared under commercial breeding practices. A large dataset of 26K duck records comprising both phenotype and imputed genotype information (60K chip) was analysed for growth, welfare and primary feather length traits. First, we employed mixed linear models with relationship matrices computed from the pedigree (BLUP) or markers (GBLUP) to estimate the variance components and breeding values. Then, we estimated the selection accuracies and selection biases to assess which model was more appropriate. Our results showed moderately high imputation accuracies of 0.93 and 0.92 for lines A and D respectively. In both lines, the heritability estimates obtained using the pedigree were generally higher than those using genomic markers in all traits considered. These ranged from 0.22±0.01 vs 0.25±0.01 (line A vs line D) for juvenile weight (JW) using marker information to 0.39±0.02 vs 0.50±0.02 (line A vs line D) for slaughter body weight (BW) using the pedigree. We observed very low estimates of heritability for gait (0.07±0.01) using markers in both lines. Breast muscle depth (BD) also had lower estimates of 0.15-0.16 using markers. For line A, the genomic predictions were generally higher when using the G-matrix than the A-matrix, with the highest predictions for BW (r² = 0.68-0.70) and JW (r² = 0.49). The estimates for gait and foot pad dermatitis (FPD) were greatly improved by using the G-matrix, at 0.58 vs 0.24 and 0.68 vs 0.44 respectively for markers vs pedigree information. For line D, the same improvements for G-matrix vs A-matrix were observed, with estimates for BD being similar in the two lines. However, for BD the G-matrix greatly improved the estimates from 0.50 to 0.71, unlike in line A where they remained at 0.50. The biases in line A were minimal (0.01-0.19) using the G-matrix compared to 0.02-0.41 when using the A-matrix. The highest observed bias was for JW followed by BD for the G-matrix, whereas when using the A-matrix we observed higher biases in many traits (JW, BW, BD and gait). The biases for line D were generally lower for the G-matrix (0.02-0.17 vs 0.00-0.19) than those observed in line A using markers, whereas higher biases were observed using the pedigree (0.01-0.37). The current findings show that all traits were heritable, with higher prediction accuracies and lower biases when using GBLUP as opposed to traditional BLUP. The present study demonstrates the effectiveness of GBLUP for improving prediction accuracy and reducing bias in selection traits of Pekin ducks, particularly for traits with low heritability. Author Summary: The study explored genomic selection in two commercial Pekin duck lines. Using a large dataset of 26,000 records, including phenotype and genotype data, researchers analyzed growth, welfare, and feather length traits. They applied statistical models to assess variance components and breeding values, comparing traditional pedigree-based methods (BLUP) with genomic marker-based methods (GBLUP). Results showed high imputation accuracies (93% for line A and 92% for line D). Heritability estimates varied, with genomic markers generally producing lower estimates than pedigrees, except for traits like gait and breast muscle depth where genomic predictions were superior. For example, line A showed higher accuracy using genomic data for body weight and juvenile weight. Overall, genomic predictions (GBLUP) provided higher accuracy and lower bias compared to traditional methods, especially for traits with low heritability. This highlights the effectiveness of GBLUP in improving selection processes in Pekin ducks.

20
AlphaImpute2: Fast and accurate pedigree and population based imputation for hundreds of thousands of individuals in livestock populations

Whalen, A.; Hickey, J. M.

2020-09-17 genetics 10.1101/2020.09.16.299677 medRxiv
Top 0.1%
25.9%

In this paper we present a new imputation algorithm, AlphaImpute2, which performs fast and accurate pedigree and population based imputation for livestock populations of hundreds of thousands of individuals. Genetic imputation is a tool used in genetics to decrease the cost of genotyping a population, by genotyping a small number of individuals at high-density and the remaining individuals at low-density. Shared haplotype segments between the high-density and low-density individuals can then be used to fill in the missing genotypes of the low-density individuals. As the size of genetics datasets has grown, the computational cost of performing imputation has increased, particularly in agricultural breeding programs where there might be hundreds of thousands of genotyped individuals. To address this issue, AlphaImpute2 performs population imputation by using a particle based approximation to the Li and Stephens model which exploits the Positional Burrows Wheeler Transform, and performs pedigree imputation using an approximate version of multi-locus iterative peeling. We tested AlphaImpute2 on four simulated datasets designed to mimic the pedigrees found in a real pig breeding program. We compared AlphaImpute2 to AlphaImpute, AlphaPeel, findhap version 4, and Beagle 5.1. We found that AlphaImpute2 had the highest accuracy, with an accuracy of 0.993 for low-density individuals on the pedigree with 107,000 individuals, compared to an accuracy of 0.942 for Beagle 5.1, 0.940 for AlphaImpute, and 0.801 for findhap. AlphaImpute2 was also the fastest software tested, with a runtime of 105 minutes on a pedigree of 107,000 individuals and 5,000 markers, compared to 190 minutes for Beagle 5.1, 395 minutes for findhap, and 7,859 minutes for AlphaImpute. We believe that AlphaImpute2 will enable fast and accurate large scale imputation for agricultural populations as they scale to hundreds of thousands or millions of genotyped individuals.